The Czech National Corpus: Principles, Design, and Results

Internal identifier: 000266 (Main/Exploration); previous: 000265; next: 000267

Authors: Karel Kucera [Czech Republic]

Source:

Literary and Linguistic Computing [ISSN 0268-1145]; 2002-06.

RBID : ISTEX:1D86EA4932E758629D37D52E7F01E5307CC404E2

Abstract

This paper describes the general principles, design, and present state of the Czech National Corpus (CNC) project. The corpus has been designed to provide a firm basis for the study of both the contemporary written Czech (a goal well attainable with the present resources) and the Czech language beyond the limits of contemporary written texts (a long‐term commitment including the building of a corpus of spoken Czech and diachronic and dialectal corpora). The work on the CNC project, now in the eighth year of its official existence, has resulted in the completion of SYN2000, a 100‐million‐word corpus of contemporary written Czech, the organization of the cores of spoken, diachronic, and dialectal corpora, and the finding of workable solutions to some general theoretical problems involved in the building of these corpora.

Url:

https://api.istex.fr/document/1D86EA4932E758629D37D52E7F01E5307CC404E2/fulltext/pdf

DOI: 10.1093/llc/17.2.245

Affiliations:

Links to previous steps (curation, corpus, ...)

to stream Istex, to step Corpus: 000161
to stream Istex, to step Curation: 000161
to stream Istex, to step Checkpoint: 000218
to stream Main, to step Merge: 000288
to stream Main, to step Curation: 000266

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">The Czech National Corpus: Principles, Design, and Results</title>
<author wicri:is="90%"><name sortKey="Kucera, Karel" sort="Kucera, Karel" uniqKey="Kucera K" first="Karel" last="Kucera">Karel Kucera</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:1D86EA4932E758629D37D52E7F01E5307CC404E2</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1093/llc/17.2.245</idno>
<idno type="url">https://api.istex.fr/document/1D86EA4932E758629D37D52E7F01E5307CC404E2/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000161</idno>
<idno type="wicri:Area/Istex/Curation">000161</idno>
<idno type="wicri:Area/Istex/Checkpoint">000218</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000218</idno>
<idno type="wicri:doubleKey">0268-1145:2002:Kucera K:the:czech:national</idno>
<idno type="wicri:Area/Main/Merge">000288</idno>
<idno type="wicri:Area/Main/Curation">000266</idno>
<idno type="wicri:Area/Main/Exploration">000266</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">The Czech National Corpus: Principles, Design, and Results</title>
<author wicri:is="90%"><name sortKey="Kucera, Karel" sort="Kucera, Karel" uniqKey="Kucera K" first="Karel" last="Kucera">Karel Kucera</name>
<affiliation wicri:level="3"><country xml:lang="fr">République tchèque</country>
<wicri:regionArea>Charles University, Praha</wicri:regionArea>
<placeName><settlement type="city">Prague</settlement>
<region type="région" nuts="2">Bohême centrale</region>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Literary and Linguistic Computing</title>
<title level="j" type="abbrev">Lit Linguist Computing</title>
<idno type="ISSN">0268-1145</idno>
<idno type="eISSN">1477-4615</idno>
<imprint><publisher>Oxford University Press</publisher>
<date type="published" when="2002-06">2002-06</date>
<biblScope unit="volume">17</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="245">245</biblScope>
<biblScope unit="page" to="257">257</biblScope>
</imprint>
<idno type="ISSN">0268-1145</idno>
</series>
<idno type="istex">1D86EA4932E758629D37D52E7F01E5307CC404E2</idno>
<idno type="DOI">10.1093/llc/17.2.245</idno>
<idno type="local">170245</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0268-1145</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">This paper describes the general principles, design, and present state of the Czech National Corpus (CNC) project. The corpus has been designed to provide a firm basis for the study of both the contemporary written Czech (a goal well attainable with the present resources) and the Czech language beyond the limits of contemporary written texts (a long‐term commitment including the building of a corpus of spoken Czech and diachronic and dialectal corpora). The work on the CNC project, now in the eighth year of its official existence, has resulted in the completion of SYN2000, a 100‐million‐word corpus of contemporary written Czech, the organization of the cores of spoken, diachronic, and dialectal corpora, and the finding of workable solutions to some general theoretical problems involved in the building of these corpora.</div>
</front>
</TEI>
<affiliations><list><country><li>République tchèque</li>
</country>
<region><li>Bohême centrale</li>
</region>
<settlement><li>Prague</li>
</settlement>
</list>
<tree><country name="République tchèque"><region name="Bohême centrale"><name sortKey="Kucera, Karel" sort="Kucera, Karel" uniqKey="Kucera K" first="Karel" last="Kucera">Karel Kucera</name>
</region>
</country>
</tree>
</affiliations>
</record>
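
The fields of the record above can be read programmatically. What follows is a minimal sketch in Python using only the standard library, not part of Dilib. It works on a trimmed copy of the record. Because the page's XML uses the `wicri:` prefix without declaring it, the sketch binds that prefix to a placeholder namespace URI on a wrapper element; the URI, the wrapper element, and the function name `parse_record` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption): the record uses the wicri: prefix
# without declaring it, so a strict XML parser would otherwise reject it.
WICRI_NS = "urn:example:wicri"

# Trimmed copy of the record shown above (same element and attribute names).
RECORD = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc>
<titleStmt><title xml:lang="en">The Czech National Corpus: Principles, Design, and Results</title>
<author wicri:is="90%"><name sortKey="Kucera, Karel" first="Karel" last="Kucera">Karel Kucera</name></author>
</titleStmt>
<publicationStmt>
<idno type="RBID">ISTEX:1D86EA4932E758629D37D52E7F01E5307CC404E2</idno>
<idno type="doi">10.1093/llc/17.2.245</idno>
<idno type="wicri:Area/Main/Exploration">000266</idno>
</publicationStmt>
</fileDesc></teiHeader></TEI></record>"""


def parse_record(xml_text):
    """Return the title, author names, and idno values of one record."""
    # Wrap the record so that the wicri: prefix is bound to a namespace.
    wrapped = f'<wrapper xmlns:wicri="{WICRI_NS}">{xml_text}</wrapper>'
    record = ET.fromstring(wrapped).find("record")
    title = record.findtext(".//titleStmt/title")
    authors = [name.text for name in record.findall(".//titleStmt/author/name")]
    # If a type occurs more than once (as ISSN does in the full record),
    # the last value wins in this simple dictionary.
    idnos = {idno.get("type"): idno.text for idno in record.iter("idno")}
    return {"title": title, "authors": authors, "idnos": idnos}


info = parse_record(RECORD)
print(info["title"])
print(info["authors"])
print(info["idnos"]["doi"])
```

The dictionary keys are the `type` attribute values exactly as they appear in the record, so `doi` and `DOI` are treated as distinct keys when the full record is parsed.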

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000266 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000266 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:1D86EA4932E758629D37D52E7F01E5307CC404E2
   |texte=   The Czech National Corpus: Principles, Design, and Results
}}
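
When links to many records are needed, the template call above can be generated rather than typed by hand. The sketch below is a hypothetical Python helper, not part of Wicri or Dilib. The function name `explor_link` and its argument names are assumptions; the template name and parameter names are copied from the call above, and the padding only mirrors its layout.

```python
def explor_link(wiki, area, flux, step, key_type, key, text):
    """Build an {{Explor lien}} template call as wiki text."""
    # Parameter names are taken verbatim from the template call above.
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", step),
        ("type", key_type),
        ("clé", key),
        ("texte", text),
    ]
    lines = ["{{Explor lien"]
    for name, value in fields:
        # Pad "name=" to nine characters so the values line up in one column.
        lines.append(f"   |{name + '=':<9}{value}")
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Wicri/Ticri",
    area="TeiVM2",
    flux="Main",
    step="Exploration",
    key_type="RBID",
    key="ISTEX:1D86EA4932E758629D37D52E7F01E5307CC404E2",
    text="The Czech National Corpus: Principles, Design, and Results",
)
print(link)
```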

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024

	TEI exploration server
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
